
    Improving Data Quality Through Effective Use of Data Semantics

    Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many “data quality” problems are actually “data misinterpretation” problems – that is, problems with data semantics. In this paper, we first illustrate some examples of these problems and then introduce a particular semantic problem that we call “corporate householding.” We stress the importance of “context” in getting the appropriate answer for each task. We then propose an approach to handling these tasks using extensions to the COntext INterchange (COIN) technology for knowledge storage and knowledge processing. Singapore-MIT Alliance (SMA)
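
    To make the notion of “context” concrete, here is a minimal sketch (in Python, not the COIN implementation) of how the same reported number can be misinterpreted unless the source's context is declared and reconciled; the contexts, exchange rates, and figures are invented for illustration.

        # Hypothetical contexts: each source declares the currency and scale
        # in which it reports financial figures.
        CONTEXTS = {
            "us_report": {"currency": "USD", "scale": 1_000},  # thousands of USD
            "jp_report": {"currency": "JPY", "scale": 1},      # units of JPY
        }

        # Assumed exchange rates, for this example only.
        RATES_TO_USD = {"USD": 1.0, "JPY": 0.007}

        def convert(value, source, receiver):
            """Reinterpret a raw value from the source context in the receiver context."""
            src, dst = CONTEXTS[source], CONTEXTS[receiver]
            in_usd = value * src["scale"] * RATES_TO_USD[src["currency"]]
            return in_usd / (dst["scale"] * RATES_TO_USD[dst["currency"]])

        # 500 (thousands of USD) and 71,000,000 (JPY) describe comparable revenues
        # once the contexts are reconciled.
        print(round(convert(500, "us_report", "us_report"), 2))         # 500.0
        print(round(convert(71_000_000, "jp_report", "us_report"), 2))  # 497.0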

    A Lightweight Ontology Approach to Scalable Interoperability

    There are many different kinds of ontologies used for different purposes in modern computing. Lightweight ontologies are easy to create but difficult to deploy; formal ontologies are relatively easy to deploy but difficult to create. This paper presents an approach that combines the strengths and avoids the weaknesses of lightweight and formal ontologies. In this approach, the ontology includes only high-level concepts; subtle differences in the interpretation of the concepts are captured as context descriptions outside the ontology. The resulting ontology is simple and thus easy to create. The context descriptions facilitate data conversion composition, which leads to a scalable solution to semantic interoperability among disparate data sources and contexts.
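
    As a rough sketch of the idea (with assumed concept names, contexts, and rates, not the authors' code): the ontology holds only the high-level concept “price”, interpretation differences live in context descriptions, and per-modifier conversions are composed instead of hand-writing one converter per source pair.

        # Context descriptions outside the ontology: how each source interprets "price".
        CONTEXTS = {
            "shop_eu": {"currency": "EUR", "vat_included": True},
            "shop_us": {"currency": "USD", "vat_included": False},
        }

        RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}   # assumed rates, for illustration only
        VAT_RATE = 0.20                           # assumed flat VAT, for illustration only

        def convert_currency(value, src, dst):
            return value * RATES_TO_USD[src["currency"]] / RATES_TO_USD[dst["currency"]]

        def convert_vat(value, src, dst):
            if src["vat_included"] and not dst["vat_included"]:
                return value / (1 + VAT_RATE)
            if not src["vat_included"] and dst["vat_included"]:
                return value * (1 + VAT_RATE)
            return value

        def mediate(value, src_name, dst_name):
            """Compose the per-modifier conversions on the fly."""
            src, dst = CONTEXTS[src_name], CONTEXTS[dst_name]
            for step in (convert_currency, convert_vat):
                value = step(value, src, dst)
            return value

        print(round(mediate(120.0, "shop_eu", "shop_us"), 2))  # EUR incl. VAT -> USD excl. VAT: 110.0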

    A Systems Theoretic Approach to the Security Threats in Cyber Physical Systems Applied to Stuxnet

    Cyber Physical Systems (CPSs) are increasingly being adopted in a wide range of industries, such as smart power grids. Even though the rapid proliferation of CPSs brings huge benefits to our society, it also provides potential attackers with many new opportunities to affect the physical world, such as disrupting the services controlled by CPSs. Stuxnet is an example of such an attack that was designed to interrupt the Iranian nuclear program. In this paper, we show how the vulnerabilities exploited by Stuxnet could have been addressed at the design level. We utilize a systems-theoretic approach, based on prior research on system safety, that takes both physical and cyber components into account to analyze the threats exploited by Stuxnet. We conclude that such an approach is capable of identifying cyber threats to CPSs at the design level, and we provide practical recommendations that CPS designers can utilize to design a more secure CPS.

    Studying the tension between digital innovation and cybersecurity

    With increasing economic pressures and exponential growth in technological innovations, companies are increasingly relying on digital technologies for innovation and value creation. But with increasing levels of cybersecurity breaches, the trustworthiness of many established and new technologies is of concern. Consequently, companies are aggressively increasing the cybersecurity of their existing and new digital assets. Most companies have to deal with these priorities simultaneously, and because the priorities frequently conflict, tensions arise. This paper introduces a framework for evaluating these risk/reward trade-offs. Through a survey and interviews, companies are positioned in different quadrants of an innovation/cybersecurity matrix, overlaid with the negative impact of cybersecurity controls on innovative projects. The paper analyzes the industry-level, firm-level, technology-management, and technology-maturity factors that affect these trade-offs. Finally, a set of recommendations is provided to help a company evaluate its positioning on the matrix, understand the underlying factors, and better manage these trade-offs. Keywords: Cybersecurity, digital innovation, CIOs
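
    As a purely illustrative sketch of the matrix positioning (the scores, the 0.5 threshold, and the quadrant labels below are assumptions, not the paper's survey instrument):

        def quadrant(innovation_score: float, cybersecurity_score: float) -> str:
            """Map two scores in [0, 1] to one of four quadrants of the matrix."""
            hi_inn = innovation_score >= 0.5
            hi_sec = cybersecurity_score >= 0.5
            if hi_inn and hi_sec:
                return "high innovation / high cybersecurity"
            if hi_inn:
                return "high innovation / low cybersecurity"
            if hi_sec:
                return "low innovation / high cybersecurity"
            return "low innovation / low cybersecurity"

        print(quadrant(0.8, 0.3))  # high innovation / low cybersecurity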

    Semantic Integration Approach to Efficient Business Data Supply Chain: Integration Approach to Interoperable XBRL

    As an open standard for the electronic communication of business and financial data, XBRL has the potential to improve the efficiency of the business data supply chain. A number of jurisdictions have developed different XBRL taxonomies as their data standards. Semantic heterogeneity exists in these taxonomies, the corresponding instances, and the internal systems that store the original data. Consequently, there are still substantial difficulties in creating and using XBRL instances that involve multiple taxonomies. To fully realize the potential benefits of XBRL, we have to develop technologies that reconcile semantic heterogeneity and enable interoperability among the various parts of the supply chain. In this paper, we analyze the XBRL standard and use examples of different taxonomies to illustrate the interoperability challenge. We also propose a technical solution that incorporates schema matching and context mediation techniques to improve the efficiency of the production and consumption of XBRL data.
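
    A minimal sketch of the two ingredients of the proposed solution, under invented taxonomy prefixes, element names, and scales (this is not actual XBRL tooling): a schema-matching table maps elements across taxonomies, and a context conversion reconciles the reporting scale of monetary facts.

        # Schema matching: element in taxonomy A -> corresponding element in taxonomy B.
        MATCHES = {
            "us-hyp:Revenues": "eu-hyp:Turnover",
        }

        # Context of each taxonomy's instances: reporting scale of monetary facts.
        CONTEXTS = {
            "us-hyp": {"scale": 1},       # reported in units
            "eu-hyp": {"scale": 1_000},   # reported in thousands
        }

        def translate_fact(element: str, value: float, target_prefix: str):
            """Map a fact to the target taxonomy's element and reporting scale."""
            target_element = MATCHES[element]
            src_scale = CONTEXTS[element.split(":")[0]]["scale"]
            dst_scale = CONTEXTS[target_prefix]["scale"]
            return target_element, value * src_scale / dst_scale

        print(translate_fact("us-hyp:Revenues", 2_500_000, "eu-hyp"))
        # ('eu-hyp:Turnover', 2500.0)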

    Evaluating and Aggregating Data Believability across Quality Sub-Dimensions and Data Lineage

    Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality. The issue of believability is particularly relevant in the context of Web 2.0, where mashups facilitate the combination of data from different sources. Our approach to assessing data believability is based on provenance and lineage, i.e. the origin and subsequent processing history of data. We present the main concepts of our model for representing and storing data provenance, and an ontology of the sub-dimensions of data believability. We then use aggregation operators to compute believability across the sub-dimensions of data believability and the provenance of data. We illustrate our approach with a scenario based on Internet data. Our contribution lies in three main design artifacts: (1) the provenance model, (2) the ontology of believability sub-dimensions, and (3) the method for computing and aggregating data believability. To our knowledge, this is the first work to operationalize provenance-based assessment of data believability.
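
    A minimal sketch of the aggregation step, with assumed sub-dimension weights and operators (weighted average across sub-dimensions, min along the lineage); the paper's actual operators and model may differ.

        # Assumed weights over believability sub-dimensions.
        SUBDIM_WEIGHTS = {"trustworthiness": 0.5, "reasonableness": 0.3, "temporality": 0.2}

        def aggregate_subdimensions(scores: dict) -> float:
            """Weighted average across believability sub-dimensions."""
            return sum(SUBDIM_WEIGHTS[d] * s for d, s in scores.items())

        def aggregate_lineage(node_scores: list) -> float:
            """Conservative aggregation along the provenance chain: weakest link wins."""
            return min(node_scores)

        source = {"trustworthiness": 0.9, "reasonableness": 0.8, "temporality": 0.6}
        mashup_step = {"trustworthiness": 0.7, "reasonableness": 0.9, "temporality": 0.9}

        lineage = [aggregate_subdimensions(source), aggregate_subdimensions(mashup_step)]
        print(round(aggregate_lineage(lineage), 2))  # 0.8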

    Addressing the Challenges of Aggregational and Temporal Ontological Heterogeneity

    In this paper, we first identify semantic heterogeneities that, when not resolved, often cause serious data quality problems. We discuss the especially challenging problems of temporal and aggregational ontological heterogeneity, which concern how complex entities and their relationships are aggregated and reinterpreted over time. We then illustrate how the COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities in a scalable manner, thereby improving data quality. Singapore-MIT Alliance (SMA)
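
    One way to picture aggregational and temporal heterogeneity (with invented companies, dates, and figures): the set of subsidiaries that make up a corporate group changes over time, so the “same” aggregate must be computed against the grouping that was valid in the year being asked about.

        from datetime import date

        # Subsidiary memberships with validity periods (aggregational structure over time).
        MEMBERSHIP = [
            ("SubA", date(2000, 1, 1), date(2004, 12, 31)),
            ("SubB", date(2000, 1, 1), date(2010, 12, 31)),
            ("SubC", date(2005, 1, 1), date(2010, 12, 31)),
        ]

        REVENUE = {("SubA", 2003): 40, ("SubB", 2003): 60,
                   ("SubB", 2006): 70, ("SubC", 2006): 50}

        def group_revenue(year: int) -> int:
            """Sum revenue over the subsidiaries that belonged to the group in that year."""
            as_of = date(year, 6, 30)
            members = [s for s, start, end in MEMBERSHIP if start <= as_of <= end]
            return sum(REVENUE.get((s, year), 0) for s in members)

        print(group_revenue(2003))  # 100: SubA + SubB
        print(group_revenue(2006))  # 120: SubB + SubC (SubA has been divested)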

    Reconciliation of temporal semantic heterogeneity in evolving information systems

    The change in the meaning of data over time poses significant challenges for the use of that data. These challenges exist in the use of an individual data source and are further compounded when multiple sources are integrated. In this paper, we identify three types of temporal semantic heterogeneity. We propose a solution based on extensions to the Context Interchange framework, which has mechanisms for capturing semantics using an ontology and temporal context. It also provides a mediation service that automatically reconciles semantic conflicts. We show the feasibility of this approach with a prototype that implements a subset of the proposed extensions.
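
    As an illustration of one such case, here is a minimal sketch (not the Context Interchange prototype) of mediating a value whose currency semantics changed over time; the fixed conversion rate of 6.55957 FRF per EUR is the official one, while the cut-over date and the data are assumptions for illustration.

        from datetime import date

        CUTOVER = date(2002, 1, 1)   # assumed date the source started recording amounts in EUR
        FRF_PER_EUR = 6.55957

        def to_eur(amount: float, recorded_on: date) -> float:
            """Interpret the stored amount according to the temporal context it was recorded in."""
            if recorded_on < CUTOVER:
                return amount / FRF_PER_EUR   # older rows are in FRF
            return amount                     # newer rows are already in EUR

        print(round(to_eur(655.957, date(2001, 6, 1)), 2))  # 100.0 (was FRF)
        print(round(to_eur(100.0, date(2003, 6, 1)), 2))    # 100.0 (already EUR)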

    Measuring Data Believability: A Provenance Approach

    Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality, measured along three dimensions: trustworthiness, reasonableness, and temporality. We ground our approach in provenance, i.e. the origin and subsequent processing history of data. We present our provenance model and our approach for computing believability based on provenance metadata. The approach is structured into three increasingly complex building blocks: (1) definition of metrics for assessing the believability of data sources, (2) definition of metrics for assessing the believability of data resulting from one process run, and (3) assessment of believability based on all the sources and processing history of data. We illustrate our approach with a scenario based on Internet data. To our knowledge, this is the first work to develop a precise approach to measuring data believability that makes explicit use of provenance-based measurements.
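
    A minimal sketch of the third building block under assumed scores, graph, and operators (min over inputs, discounted by an assumed process reliability); the paper defines its own metrics, which this does not reproduce.

        # Provenance: each derived item lists the items (or sources) it was computed from.
        PROVENANCE = {
            "report": ["merged"],
            "merged": ["feed_a", "feed_b"],
        }

        # Believability assessed directly for the leaf sources (e.g., from
        # trustworthiness, reasonableness, and temporality metrics).
        SOURCE_BELIEF = {"feed_a": 0.9, "feed_b": 0.6}

        PROCESS_RELIABILITY = 0.95   # assumed reliability of each processing step

        def believability(item: str) -> float:
            """Recursively propagate believability from sources through processing history."""
            if item in SOURCE_BELIEF:
                return SOURCE_BELIEF[item]
            inputs = PROVENANCE[item]
            # A derived item is no more believable than its weakest input,
            # discounted by the reliability of the step that produced it.
            return PROCESS_RELIABILITY * min(believability(i) for i in inputs)

        print(round(believability("report"), 2))  # 0.54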

    Context Interchange as a Scalable Solution to Interoperating Amongst Heterogeneous Dynamic Services

    Many online services access a large number of autonomous data sources and at the same time need to meet different user requirements. It is essential for these services to achieve semantic interoperability among the entities that exchange information. In the presence of an increasing number of proprietary business processes, heterogeneous data standards, and diverse user requirements, it is critical that the services be implemented using adaptable, extensible, and scalable technology. The COntext INterchange (COIN) approach, inspired by the similar goals of the Semantic Web, provides a robust solution. In this paper, we describe how COIN can be used to implement dynamic online services where semantic differences are reconciled on the fly. We show that COIN is flexible and scalable by comparing it with several conventional approaches. For a given ontology, the number of conversions in COIN is quadratic in the number of distinctions of the semantic aspect that has the most distinctions. These semantic aspects are modeled as modifiers in a conceptual ontology; in most cases the number of conversions is linear in the number of modifiers, which is significantly smaller than in the traditional hard-wired middleware approach, where the number of conversion programs is quadratic in the number of sources and data receivers. In the example scenario in the paper, the COIN approach needs only 5 conversions to be defined, while traditional approaches require 20,000 to 100 million. COIN achieves this scalability by automatically composing all the comprehensive conversions from a small number of declaratively defined sub-conversions. Singapore-MIT Alliance (SMA)
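
    The scalability claim can be made concrete with a back-of-the-envelope comparison; the counts of sources, receivers, and modifier distinctions below are assumptions chosen for illustration, not the paper's scenario.

        def pairwise_conversions(n_sources: int, n_receivers: int) -> int:
            """Hard-wired middleware: one conversion program per (source, receiver) pair."""
            return n_sources * n_receivers

        def coin_conversions(distinctions_per_modifier: list) -> int:
            """COIN-style: declare sub-conversions per modifier and compose the rest.
            Worst case assumed here: one conversion between each ordered pair of a
            modifier's distinct values; often a single parameterized rule suffices."""
            return sum(d * (d - 1) for d in distinctions_per_modifier)

        # 200 sources and 100 receivers, with 3 modifiers (e.g., currency with 4
        # distinctions, scale factor with 3, date format with 2).
        print(pairwise_conversions(200, 100))  # 20000 conversion programs
        print(coin_conversions([4, 3, 2]))     # 20 declaratively defined sub-conversions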